Conversation

Collaborator

@DavidBert DavidBert commented Sep 26, 2025

Mirage pipeline

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

sayakpaul and others added 2 commits September 26, 2025 21:50
* slight edits to the attention backends docs.

* Update docs/source/en/optimization/attention_backends.md

Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: Steven Liu <[email protected]>
…uggingface#12393)

* start unbloating docstrings (save_lora_weights).

* load_lora_weights()

* lora_state_dict

* fuse_lora

* unfuse_lora

* load_lora_into_transformer
Collaborator

@photoroman photoroman left a comment


I did a brief initial pass. This looks really great! Thank you!

The main missing part for me is more docs. I think we should add docstrings to all public classes and methods. We should probably also add docs and examples in docs/source/en/api/pipelines/mirage.md.
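For the mirage.md page, a minimal usage example along these lines might be a good start (a sketch only: the repo id "photoroom/mirage" and the 256x256 default are placeholders, and the final pipeline API may differ):

```python
import torch
from diffusers import MiragePipeline

# Placeholder repo id; replace with the released Mirage checkpoint.
pipe = MiragePipeline.from_pretrained("photoroom/mirage", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "A digital painting of a rusty, vintage tram on a sandy beach"
image = pipe(prompt, height=256, width=256).images[0]
image.save("mirage.png")
```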

if vae_type == "flux":
config_path = "/raid/shared/storage/home/davidb/diffusers/diffusers_pipeline_checkpoints/pipeline_checkpoint_fluxvae_gemmaT5_updated/transformer/config.json"
elif vae_type == "dc-ae":
config_path = "/raid/shared/storage/home/davidb/diffusers/diffusers_pipeline_checkpoints/pipeline_checkpoint_dcae_gemmaT5_updated/transformer/config.json"
Collaborator


Should we change these hardcoded paths?

Collaborator Author


Absolutely!
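One possible shape, just as a sketch (the flag names --vae_type and --transformer_config are made up here), is to take the reference locations from the command line instead of hardcoding internal storage paths:

```python
import argparse


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Convert a Mirage checkpoint to diffusers format.")
    parser.add_argument("--vae_type", choices=["flux", "dc-ae"], default="flux")
    parser.add_argument(
        "--transformer_config",
        type=str,
        required=True,
        help="Path to the reference transformer config.json.",
    )
    return parser.parse_args()
```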

#!/usr/bin/env python3
"""
Script to convert Mirage checkpoint from original codebase to diffusers format.
"""
Collaborator


Should we release this script or put it into the computer vision repo and release the converted checkpoints?

Collaborator


I think the idea is to keep this only internally, right?

Collaborator Author


Looking at other scripts in this folder, it seems that most companies actually put these kinds of scripts in diffusers.

mapping = {}

# RMSNorm: scale -> weight
for i in range(16): # 16 layers
Collaborator


Use a MIRAGE_NUM_LAYERS: int = 16 constant at the top?

Collaborator Author


I changed it to come from a config instead.
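Something along these lines, as a sketch (the config layout and the num_layers key are assumptions; the real config may store the layer count under a different name, and the mapping keys here are illustrative):

```python
import json


def build_param_mapping(config_path: str) -> dict:
    with open(config_path) as f:
        config = json.load(f)
    # Take the layer count from the config instead of hardcoding 16.
    num_layers = config.get("num_layers", 16)

    mapping = {}
    for i in range(num_layers):
        # RMSNorm: scale -> weight (key names are illustrative only)
        mapping[f"layers.{i}.attn_norm.scale"] = f"blocks.{i}.attn_norm.weight"
    return mapping
```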

if vae_type == "flux":
ref_pipeline = "/raid/shared/storage/home/davidb/diffusers/diffusers_pipeline_checkpoints/pipeline_checkpoint_fluxvae_gemmaT5_updated"
else: # dc-ae
ref_pipeline = "/raid/shared/storage/home/davidb/diffusers/diffusers_pipeline_checkpoints/pipeline_checkpoint_dcae_gemmaT5_updated"
Collaborator


Should we change these hardcoded paths?

Collaborator Author


I removed all dependencies on the previous ref pipeline; that was a mistake.

return mapping


def convert_checkpoint_parameters(old_state_dict: dict) -> dict:
Collaborator


Probably good to use more specific types, like Dict[str, str]. Import Dict from typing, because the diffusers library supports Python 3.8, and built-in generic types (e.g. dict[str, str]) are only supported since Python 3.9.
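For illustration, the 3.8-compatible annotation could look like this (the torch.Tensor value type is an assumption based on the state-dict usage; the body here is a stub):

```python
from typing import Dict

import torch


# typing.Dict keeps the annotations valid on Python 3.8; dict[str, ...] needs 3.9+.
def convert_checkpoint_parameters(old_state_dict: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    new_state_dict: Dict[str, torch.Tensor] = {}
    for old_key, tensor in old_state_dict.items():
        new_state_dict[old_key] = tensor  # actual key renaming omitted in this sketch
    return new_state_dict
```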

logger = logging.get_logger(__name__)


def get_image_ids(bs: int, h: int, w: int, patch_size: int, device: torch.device) -> Tensor:
Collaborator


  • Use more readable variable names: batch_size, height, width.
  • Add docstrings for all public methods and classes.
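As a sketch of both suggestions combined (the exact id layout is an assumption; the real implementation may encode positions differently):

```python
import torch
from torch import Tensor


def get_image_ids(batch_size: int, height: int, width: int, patch_size: int, device: torch.device) -> Tensor:
    """Return per-patch (row, col) position ids with shape (batch_size, num_patches, 2)."""
    rows = torch.arange(height // patch_size, device=device)
    cols = torch.arange(width // patch_size, device=device)
    grid = torch.cartesian_prod(rows, cols)  # (num_patches, 2)
    return grid.unsqueeze(0).expand(batch_size, -1, -1)
```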

assert attention_mask.dim() == 2, f"Unsupported attention_mask shape: {attention_mask.shape}"
assert attention_mask.shape[-1] == l_txt, (
f"attention_mask last dim {attention_mask.shape[-1]} must equal text length {l_txt}"
)
Collaborator


Should these be checked as if conditions that raise a ValueError? Usually asserts are for catching programming errors, not for input validation.
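Roughly, the same checks as input validation could read like this (a sketch, reusing the variable names from the snippet above):

```python
import torch


def _validate_attention_mask(attention_mask: torch.Tensor, l_txt: int) -> None:
    # ValueError instead of assert: these checks also run under `python -O`
    # and give users a clearer error for malformed inputs.
    if attention_mask.dim() != 2:
        raise ValueError(f"Unsupported attention_mask shape: {attention_mask.shape}")
    if attention_mask.shape[-1] != l_txt:
        raise ValueError(
            f"attention_mask last dim {attention_mask.shape[-1]} must equal text length {l_txt}"
        )
```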

vae ([`AutoencoderKL`] or [`AutoencoderDC`]):
Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations.
Supports both AutoencoderKL (8x compression) and AutoencoderDC (32x compression).
"""
Collaborator


I think in addition to this, we need to add docs about Mirage in docs/source/en/api/pipelines/mirage.md. Have a look at the section "Adding a new pipeline/scheduler" in docs/README.md.


# 0. Default height and width to transformer config
height = height or 256
width = width or 256
Collaborator


nit: put 256 into a constant.

Collaborator Author

@DavidBert DavidBert Sep 30, 2025


I specify it in the checkpoint config now and fall back to a constant when it is not specified.
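For reference, a sketch of the config-first, constant-fallback default (the sample_size field name is an assumption about the checkpoint config):

```python
DEFAULT_SAMPLE_SIZE = 256  # used only when the checkpoint config does not specify a size


def resolve_default_size(height, width, transformer_config):
    default = getattr(transformer_config, "sample_size", None) or DEFAULT_SAMPLE_SIZE
    return height or default, width or default
```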

)

# Convert back to image format
from ...models.transformers.transformer_mirage import seq2img
Collaborator


Any reason not to import this at the top?

Collaborator


I've become quite keen on in-code imports for cases like this one, where seq2img is only used once in the whole file :D

Collaborator Author


I moved it to the top.

"""


class MiragePipeline(
Collaborator


I remember Eliot started a thread about maybe renaming Mirage to something else, since there's already an ML model with that name. Did that lead anywhere?

sayakpaul and others added 15 commits September 29, 2025 13:08
…ChromaPipeline (huggingface#12263)

* fix: preserve boolean dtype for attention masks in ChromaPipeline

- Convert attention masks to bool and prevent dtype corruption
- Fix both positive and negative mask handling in _get_t5_prompt_embeds
- Remove float conversion in _prepare_attention_mask method

Fixes huggingface#12116

* test: add ChromaPipeline attention mask dtype tests

* test: add slow ChromaPipeline attention mask tests

* chore: removed comments

* refactor: removing redundant type conversion

* Remove dedicated dtype tests as per  feedback

---------

Co-authored-by: Dhruv Nair <[email protected]>
…uggingface#12275)

* feat: support aobaseconfig classes.

* [docs] AOBaseConfig (huggingface#12302)

init

Co-authored-by: Sayak Paul <[email protected]>

* up

* replace with is_torchao_version

* up

* up

---------

Co-authored-by: Steven Liu <[email protected]>
* init

* feedback

---------

Co-authored-by: Sayak Paul <[email protected]>
* init

* config

* lora metadata

* feedback

* fix

* cache allocator warmup for from_single_file

* feedback

* feedback
* fix

* add mellon node registry

* style

* update docstring to include more info!

* support custom node mellon

* HTTPError -> HfHubHTTPError

* up

* Update src/diffusers/modular_pipelines/qwenimage/node_utils.py
…ependency (huggingface#12352)

* update

* update

* update

* update

* update
fix xpu ut failures w/ latest pytorch

Signed-off-by: Yao, Matrix <[email protected]>
…ormers from main (huggingface#12395)

* Allow prerelease when installing transformers from main

* maybe better

* maybe better

* and now?

* just bored

* should be better

* works now
* init

* feedback

* feedback

* feedback

* feedback

* feedback

* feedback
* change syntax

* make style
Collaborator

@photoroman photoroman left a comment


Had a quick look. Great! I trust Claude renamed everything correctly.

pipe.to("cuda")

prompt = "A digital painting of a rusty, vintage tram on a sandy beach"
prompt = "A vibrant night sky filled with colorful fireworks, with one large firework burst forming the glowing text “Photon” in bright, sparkling light"
Collaborator


Haha, awesome!

linoytsaban and others added 26 commits October 7, 2025 18:27
* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled

* fix bug when offload and cache_latents both enabled
* fix dockerfile definitions.

* python 3.10 slim.

* up

* up

* up

* up

* up

* revert pr_tests.yml changes

* up

* up

* reduce python version for torch 2.1.0